Influence of Outliers on Accuracy Estimation in Genomic Prediction in Plant Breeding
Abstract
Outliers often pose problems in analyses of data in plant breeding, but their influence on the performance of methods for estimating predictive accuracy in genomic prediction studies has not yet been evaluated. Here, we evaluate the influence of outliers on the performance of methods for accuracy estimation in genomic prediction studies using simulation. We simulated 1000 datasets for each of 10 scenarios to evaluate the influence of outliers on the performance of seven methods for estimating accuracy. These scenarios are defined by the number of genotypes, marker effect variance, and magnitude of outliers. To mimic outliers, we added to one observation in each simulated dataset, in turn, 5-, 8-, and 10-times the error SD used to simulate small and large phenotypic datasets. The effect of outliers on accuracy estimation was evaluated by comparing deviations in the estimated and true accuracies for datasets with and without outliers. Outliers adversely influenced accuracy estimation, more so at small values of genetic variance or number of genotypes. A method for estimating heritability and predictive accuracy in plant breeding and another used to estimate accuracy in animal breeding were the most accurate and resistant to outliers across all scenarios and are therefore preferable for accuracy estimation in genomic prediction studies. The performances of the other five methods that use cross-validation were less consistent and varied widely across scenarios. The computing time for the methods increased as the size of outliers and sample size increased and the genetic variance decreased.
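The contamination step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' simulation code: the sample size, the variances, the helper names `add_outlier` and `pearson`, and the use of a plain correlation between simulated genetic values and phenotypes as a stand-in for an accuracy estimate are all assumptions made here for illustration. None of the seven methods evaluated in the study is implemented.

```python
import random
import statistics


def add_outlier(phenotypes, index, k, error_sd):
    """Return a copy of `phenotypes` with k * error_sd added to one observation."""
    contaminated = list(phenotypes)
    contaminated[index] += k * error_sd
    return contaminated


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical settings; the study's actual parameters are not given here.
rng = random.Random(1)
n_genotypes = 50
genetic_sd = 1.0
error_sd = 1.0

# Simulated genetic values and phenotypes (genetic value plus random error).
g = [rng.gauss(0, genetic_sd) for _ in range(n_genotypes)]
y = [gi + rng.gauss(0, error_sd) for gi in g]

# Stand-in "accuracy": correlation of phenotype with the true genetic value.
baseline = pearson(g, y)

# Add 5, 8 and 10 times the error SD to one observation, in turn,
# and record how far the stand-in accuracy moves from the clean baseline.
shifts = {}
for k in (5, 8, 10):
    y_out = add_outlier(y, index=0, k=k, error_sd=error_sd)
    shifts[k] = pearson(g, y_out) - baseline
```

Repeating this over many simulated datasets and comparing the distribution of `shifts` across scenarios would mirror, in a very simplified form, the comparison of deviations with and without outliers that the abstract describes.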
Similar resources
Accuracy of Genomic Prediction under Different Genetic Architectures and Estimation Methods
The accuracy of genomic breeding value prediction was investigated in various levels of reference population size, trait heritability and the number of quantitative trait locus (QTL). Five Bayesian methods, including Bayesian Ridge regression, BayesA, BayesB, BayesC and Bayesian LASSO, were used to estimate the marker effects for each of 27 scenarios resulted from combining three levels for her...
Effect of Markers Effect Estimation Methods, Population Structure and Trait Architecture on the Accuracy of Genomic Breeding Values
This study aimed to investigate the effect of the method of estimating marker effects, QTL distribution, number of QTLs, effective population size and trait heritability on the accuracy of genomic predictions. Two effective population sizes, 100 and 500 individuals, were simulated by QMSim software. A 100 cM genome including one chromosome was simulated where 500 SNPs and two diffe...
Comparing Different Marker Densities and Various Reference Populations Using Pedigree-Marker Best Linear Unbiased Prediction (BLUP) Model
For genomic selection to be applied successfully, the reference population and marker density should be chosen properly. The purpose of this study was to investigate the accuracy of genomic estimated breeding values at low (5K), intermediate (50K) and high (777K) marker densities in simulated populations, under different scenarios for selecting the reference population. Af...
Comparison of Single and Multi-Step Bayesian Methods for Predicting Genomic Breeding Values in Genotyped and Non-Genotyped Animals- A Simulation Study
The purpose of this study was to compare the accuracy of genomic evaluation for the Bayes A, Bayes B, Bayes C and Bayes L multi-step methods and the SSBR-C and SSBR-A single-step methods at different values of π for predicting genomic breeding values of genotyped and non-genotyped animals. A genome with 40000 SNPs on 20 chromosomes was simulated with the same distance (100cM). The π valu...
Evaluating the Accuracy of Genomic Prediction under Different Genomic Architectures of Quantitative and Threshold Traits with Imputation of Simulated Genomic Data, Using the Random Forest Method
Genomic selection is a promising challenge for discovering genetic variants influencing quantitative and threshold traits for improving the genetic gain and accuracy of genomic prediction in animal breeding. Since a proportion of genotypes are generally uncalled, prediction of genomic accuracy requires imputation of missing genotypes. The objectives of this study were (1) to quantify...